Handling Redirects
A server-side redirect, depending on how it is handled, can be easily traversed by Python’s urllib library without any help from Selenium;
client-side redirects won’t be handled at all unless something is actually executing the JavaScript.
Selenium handles these JavaScript redirects the same way it handles any other JavaScript execution; the hard part is knowing when to stop page execution, that is, how to tell when a page is done redirecting.
One clever way to detect the redirect is to watch an element in the DOM when the page initially loads, then repeatedly call that element until Selenium throws a StaleElementReferenceException; at that point the element is no longer attached to the page’s DOM and the site has redirected.
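A minimal sketch of the stale-element check described above. The function name wait_for_redirect, the timeout and poll values, and the choice of is_enabled() as the call that touches the element are assumptions made for illustration; the notes do not specify them. The fallback exception class exists only so the sketch can be exercised without Selenium installed.

```python
import time

try:
    from selenium.common.exceptions import StaleElementReferenceException
except ImportError:
    # Stand-in so the sketch can be exercised without Selenium installed.
    class StaleElementReferenceException(Exception):
        pass


def wait_for_redirect(driver, timeout=10.0, poll=0.5):
    """Return True once the original <html> element goes stale, False on timeout."""
    # "tag name" is the locator string behind Selenium's By.TAG_NAME.
    elem = driver.find_element("tag name", "html")
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Any call on the element forces a round trip to the browser,
            # which fails once the element is detached from the DOM.
            elem.is_enabled()
        except StaleElementReferenceException:
            return True
        time.sleep(poll)
    return False
```

With a real browser this would be called right after driver.get(url); a False result means the page did not redirect within the timeout, not that it never will.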
Image Processing and Text Recognition
Pillow
Pillow allows you to easily import and manipulate images with a variety of filters, masks, and even pixel-specific transformations.
from PIL import Image, ImageFilter

# Open the image, apply a Gaussian blur, then save and display the result
kitten = Image.open("kitten.jpg")
blurryKitten = kitten.filter(ImageFilter.GaussianBlur)
blurryKitten.save("kitten_blurred.jpg")
blurryKitten.show()
For more of what Pillow can do, see the documentation at http://pillow.readthedocs.org/
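A small sketch of a pixel-specific transformation, using Image.point() to threshold an image to pure black and white. The threshold function name and the cutoff of 128 are assumptions chosen for illustration, and the tiny in-memory image stands in for a real file.

```python
from PIL import Image


def threshold(image, cutoff=128):
    """Return a grayscale copy where pixels at or above cutoff become white."""
    gray = image.convert("L")
    # point() applies the function to every possible pixel value (0-255)
    return gray.point(lambda value: 255 if value >= cutoff else 0)


# Build a 4x1 grayscale image in memory so no image file is needed
gradient = Image.new("L", (4, 1))
for x, value in enumerate([0, 100, 128, 255]):
    gradient.putpixel((x, 0), value)

result = threshold(gradient)
print([result.getpixel((x, 0)) for x in range(4)])  # → [0, 0, 255, 255]
```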
Tesseract
Tesseract can be used to scrape text from images on a website.
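One way this might look, as a sketch only: call the tesseract command-line program on an image that has already been downloaded. The helper names build_tesseract_command and image_to_text are hypothetical, and the sketch assumes the tesseract executable is installed and on PATH with English language data.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list for one Tesseract CLI call that prints to stdout."""
    # Using "stdout" as the output base makes Tesseract print the text
    # instead of writing it to a file.
    return ["tesseract", image_path, "stdout", "-l", lang]


def image_to_text(image_path, lang="eng"):
    """Run Tesseract on a local image file and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("the tesseract executable was not found on PATH")
    completed = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()
```

Calling image_to_text("downloaded.png") would return whatever text Tesseract recognizes in that file; the file name here is a placeholder.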